Abstract

The possibility of translating logic programs into functional ones has long been a 
subject of investigation. Common to the many approaches is that the original logic 
program, in order to be translated, needs to be well-moded and this has led to the 
common understanding that these programs can be considered to be the "functional 
part" of logic programs. As a consequence of this it has become widely accepted that 
"complex" logical variables, the possibility of a dynamic selection rule, and general 
properties of non-well-moded programs are exclusive features of logic programs. 

This is not quite true, as some of these features are naturally found in lazy functional 
languages. 

We readdress the old question of what features are exclusive to the logic programming
paradigm by defining a simple translation applicable to a wider range of logic
programs, and demonstrate that the current circumscription is unreasonably restrictive.

Keywords: Logic Programming, Functional Programming, Lazy Evaluation. 

ACM Computing Classification System: D.1.1 (applicative (functional) programming);
D.1.6 (logic programming); D.3.2 (language classifications); F.3.3 (studies of
program constructs)

1 Introduction 



The possibility of translating logic programs into functional ones has long been a subject
of investigation; proposals include [Mar94, Mar95, GW92, RKS98, Red84, vR97].
Such systems are usually devised for one of the following purposes: proving
program properties, providing better insight into the relation between functional and logic
languages, or - to a minor extent - improving program performance.

Common to all the approaches mentioned is that the original logic program, in order 
to be translated, needs to be well-moded and this has led to the common understanding 
that these programs can be considered to be the "functional part" of logic programs. This 
is confirmed by the following statement in [Mar95]: "... the class of functionally moded
(well-moded and simply moded) programs can be rightly considered the functional core of
logic programs".

Well-moded programs have, among other features, a straightforward left-to-right dataflow
model (see [AE93, AM94]) and prohibit the use of logical variables to their full potential, such
as in complex logical data structures like difference-lists. As a consequence of this it is now
widely accepted that "complex" logical variables, the possibility of a dynamic selection rule,
and general properties of non-well-moded programs are exclusive features of logic programs.

This is not quite right. At least, not to the extent that one is brought to think. 

In this paper we show, among other things, that logical structures such as difference 
lists have a natural counterpart in lazy functional programs; i.e. that most programs us- 
ing difference-lists are functional in nature. This shows immediately that many common 
non-well-moded programs are functional in nature and that well-modedness is thus not a 
necessary attribute of those logic programs behaving functionally. We do this by employ- 
ing a straightforward - literal - translation of moded logic programs into Haskell, a lazy 
functional language. 

Furthermore, we use the same translation system to show that some programs requiring 
a dynamic scheduling mechanism are also intrinsically functional. 

Summarizing, in this paper we readdress the old question of what features are exclusive 
to the logic programming paradigm and demonstrate that the current circumscription is 
unreasonably restrictive. 

2 Preliminaries 

Due to space constraints we omit preliminaries and assume that the reader is acquainted
with the terminology and the main results of logic programming theory (see [Apt90, Llo87]).
In this paper we use overlined characters to indicate a (possibly empty) sequence of objects,
so $\bar{t}$ can denote a sequence $t_1, \ldots, t_n$ of terms, $\bar{x}$ a sequence of variables and $\bar{A}$ a sequence
of atoms (i.e. a query). To avoid confusion with built-in symbols, we use $\equiv$ to indicate
syntactic equivalence.

In what follows we study logic programs executed by means of LD-resolution, which
consists of SLD-resolution combined with the leftmost selection rule. An SLD-derivation
in which the leftmost selection rule is used is called an LD-derivation.

2.1 Modes for Logic Programs 

This section is partially borrowed from [AE93]; we refer to the appendix and to [AM94] for
further information on well-moded logic programs.

Definition 2.1 Consider an n-ary relation symbol p. By a mode for p we mean a function
$m_p$ from $\{1, \ldots, n\}$ to the set {In, Out}. If $m_p(i)$ = In, we call i an input position of p and if
$m_p(i)$ = Out, we call i an output position of p (both w.r.t. $m_p$). □

An n-ary relation p with a mode $m_p$ will be denoted by $p(m_p(1), \ldots, m_p(n))$. For example, for
programs member and append we typically have the following listings and modes:

    mode member(In, In).
    member(El, [El|_]).
    member(El, [_|Rest]) ← member(El, Rest).

    mode append(In, In, Out).
    append([], List, List).
    append([H|Tail], List, [H|Tail']) ← append(Tail, List, Tail').

Modes indicate how the arguments of a relation should be used. We assume that to each 
relation symbol is associated a unique mode. Multiple modes can be obtained by simply 
renaming the relations. 

In the presence of modes, we require the programs and the queries to be naturally
consistent with respect to them. Before we introduce the notion of consistency we have to provide some
further notation. When writing an atom as $p(\bar{u}, \bar{v})$ we now assume that $\bar{u}$ is a sequence of
terms filling in the input positions of p and $\bar{v}$ is a sequence of terms filling in the output
positions. Thus, for notational simplicity, we assume that the input positions come first.

Let us call producing the input positions of the head and the output positions of the body
atoms, and consuming the other positions of a clause. We have the following definition.

Definition 2.2 (Consistent) A clause (query) is consistent iff every variable occurs in at 
least one producing position. □ 

The last LP notion we need is that of a plain program. Here and in the sequel, a set
of terms is called linear if every variable occurs at most once in it. In other words, a set of
terms is linear iff no variable has two distinct occurrences in any of the terms and no two
terms have a variable in common.

Definition 2.3 (Plain) A clause $p_0(\bar{s}, \bar{t}_{n+1}) \leftarrow p_1(\bar{s}_1, \bar{t}_1), \ldots, p_n(\bar{s}_n, \bar{t}_n)$ is called plain
if

(i) $\bar{t}_1, \ldots, \bar{t}_n$ is a linear family of variables;

(ii) $\bar{s}$ is linear.

A query Q is called plain iff the clause q ← Q is, where q is any (dummy) atom of zero arity.

A program is called plain if every clause of it is. □ 

Thus a plain program is a program in which producing positions are filled in by variables 
and in which a variable occurs in at most one producing position. 

Condition (i) is similar to, though less restrictive than, that of simply moded programs
as defined in [AE93] (as we do not impose an ordering constraint).

Our translation requires programs to be consistent and plain. This is far less restrictive
than well-modedness plus simple-modedness and, as we shall see, allows us to capture
a broader segment of the functional behaviour found in logic programs. Indeed, it is
perhaps too lenient, but it suffices for the goal of this paper, which is to broaden the characterization of
logic programs. Regarding the (non-)restrictiveness of the concepts of plain and consistent
programs, we have the following:

Remark 2.4 It is important to realize that most programs are plain¹, and that non-plain
programs can naturally be transformed into equivalent plain ones: virtually all consistent
programs are either plain or safely translatable into a plain form. This is also practically
demonstrated by the fact that the language Mercury employs a pre-processing phase in
which all programs are translated into a superhomogeneous form (which is very similar to
the form of plain programs). Concerning (ii), we can always transform a consistent program
P into an equivalent consistent program P' which satisfies it. For instance, for the member
program defined above, we can transform its first clause into member(El, [Head|Rest])
← El == Head. It is also worth noticing that append is already input-linear. □

2.2 Haskell Programs 

Our translation system maps logic programs into lazy functional programs, which are written 
in (a subset of) Haskell [HPW92]. The subset we use includes the proposed extension of 
pattern guards [Pey97], which we describe below. 

The programs we are going to generate are built as sets of equations, each of the following
form:

    f s_1 ... s_j | guard_{1,1}, ..., guard_{1,m_1} = result_1
                  ...
                  | guard_{n,1}, ..., guard_{n,m_n} = result_n
                  | otherwise = result_{n+1}

where f is a function symbol, s_1 ... s_j are parameters and guard_{x,y}, otherwise are guard
qualifiers. The '|' introduces a guard, and the ',' acts as a logical conjunction. Pattern
matching may take place on the parameters. Without pattern guards, the guard qualifiers
would have to be boolean expressions; that is, we would only return result_1 if the associated
guards guard_{1,1}, ..., guard_{1,m_1} all evaluate to true. The semantics of Haskell dictates that
definitions and guards are tried in sequential order.

The situation with pattern guards is somewhat different. In fact, pattern guards can
also contain let-expressions (which are defined as usual) and pattern-matching expressions,
which are expressions of the form pattern <- term, and whose semantics is the following: if
term matches pattern then the variables in pattern are appropriately instantiated, and
the pattern-matching guard returns true; otherwise it returns false. Consider:

    f z | let x = g z
        , [y] <- x
        , y > 10 = (True, 1)
        | otherwise = (False, 0)

1 This assertion is substantiated by the fact that most programs are simply moded, as shown by the
"mini-survey" at the end of [AE93].

Here, the tuple (True, 1) will only be returned if the qualifiers in the first guard all succeed.
That is, 1) x can be pattern matched to a list of one element (denoted by [y],
which also binds y to this element), and 2) the boolean condition y > 10 is true. In all this,
the value of x is determined by g z. If any of these fail, then the second guard is tried. In
this case, it is the special guard otherwise, which always succeeds. A let qualifier can
also introduce recursive bindings; this will become clear in the sequel. A more detailed
example is presented in Appendix B.
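The pattern-guard example can be made concrete as follows. This is only a minimal sketch: the paper does not define g, so the helper below (which returns a one-element list for positive arguments and the empty list otherwise) and the type signatures are assumptions made purely for illustration.

```haskell
-- Hypothetical helper standing in for the unspecified g of the text:
-- a one-element list for positive arguments, the empty list otherwise.
g :: Int -> [Int]
g z
  | z > 0     = [z]
  | otherwise = []

-- The function f of the text, written with pattern guards.
f :: Int -> (Bool, Int)
f z
  | let x = g z      -- let qualifier: binds x and always succeeds
  , [y] <- x         -- pattern qualifier: succeeds only on a one-element list
  , y > 10           -- boolean qualifier
  = (True, 1)
  | otherwise        -- tried only if some qualifier above did not succeed
  = (False, 0)
```

Under these assumptions f 42 yields (True, 1), whereas f 5 (boolean qualifier false) and f (-3) (pattern qualifier fails) both fall through to the otherwise guard and yield (False, 0).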

As explained below, we need to capture the fact that a predicate might succeed (possibly
returning a computed answer substitution), or fail. To do so, we introduce a new datatype
Result in our Haskell programs by:

    data Result a = Suc a | Fail

That is, the datatype Result has two constructors, Fail and Suc, the latter of which can
be applied to some term.
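As a small illustration of how this datatype is meant to be used, the sketch below wraps a comparison as a function returning a Result and consumes it through a pattern guard. The names lessThan and smaller, the restriction to Int, and the derived Show and Eq instances are assumptions of this sketch and do not come from the paper.

```haskell
-- The datatype of the text; the derived instances are added here only
-- so that results can be printed and compared.
data Result a = Suc a | Fail
  deriving (Show, Eq)

-- Hypothetical rendering of the comparison < as a function that can fail.
-- It has no outputs, so success carries the unit value ().
lessThan :: (Int, Int) -> Result ()
lessThan (x, y)
  | x < y     = Suc ()
  | otherwise = Fail

-- A caller inspects the result only by matching it against Suc.
smaller :: (Int, Int) -> Result Int
smaller (x, y)
  | Suc () <- lessThan (x, y) = Suc x
  | otherwise                 = Fail
```

With these definitions smaller (1, 2) evaluates to Suc 1 and smaller (3, 2) evaluates to Fail.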

Note that the Haskell programs which we generate are not the obvious programs that a 
functional programmer would write - this is not the intention. They do, however, do what 
the logic programmer intended. All of the programs given in this paper can be compiled by 
any Haskell compiler supporting pattern guards. 

We also want to mention that despite the fact that pattern guards are an important
feature of our translation, we could do without them - at the price of a less elegant translation.
A Haskell compiler will usually regard these as syntactic sugar anyway, and compile them 
into more basic primitives already found in Haskell. This implies that all the statements 
we are going to give in the sequel are true regardless of the availability of a pattern guard 
construct in the target language. 

3 A Translation System 

In logic programming, queries can succeed, loop or fail. This third possibility is of crucial 
importance, since it is often used as a control mechanism. As an example, one can consider 
the following programming scheme: 

    p(X) ← generate(X), test(X).

Where test verifies that the value produced by generate is appropriate, and failure and 
backtracking take care of the ill-formed terms. Another common scheme is the following 
one: 

    p(X) ← test_a, X = 1.
    p(X) ← test_b, X = 2.

Where test_a and test_b model a typical case statement, and the selection of the right 
branch is done via the failure and backtracking mechanism. 

Nevertheless, relations which are "not supposed to fail" are quite common in logic
programming. We say that a relation is "not supposed to fail" if - when called in a "correct"
way - it produces at least one answer. Examples of such relations are sort, flatten and
append (the latter, for calls of the form append(l1, l2, X), where l1 and l2 are lists and
X is a new variable, will always produce one answer).

The ubiquity of predicates which are not supposed to fail is confirmed by the fact that 
Mercury requires the programmer to specify for each relation symbol, whether it might fail 
or not. This information is then used to generate optimized code. 

We do the same thing for our translation, and from now on we assume that the set of 
predicate symbols is partitioned into two disjoint sets, namely 

non-test predicates, which, when called, are expected to produce at least one answer (such 
as append), and 

test predicates, which when called are allowed to report no answer, i.e., to fail immediately 
or succeed (such as member and <). 

Thus, we have the following definition. 

Definition 3.1 A partitioning is a map from the set of predicate symbols into the set 
{test, non-test}. 

Let P be a program and Q be a set of queries. We say that P is correct wrt. Q iff for
every $A \in Q$, every time that a non-test atom B is selected in an LD-derivation of A in P,
B has at least one successful LD-derivation. □

Thus every program is correct wrt. the trivial partitioning in which all predicates are 
test. Checking correctness is orthogonal to the purposes of this paper, but we should mention 
that it can be done either using abstract interpretation [DLGH97] or on modes and types 
[PR97]; also Mercury employs a system based on modes and types in order to check that 
the programs are consistent (modulo non-termination) wrt. the partitioning provided by 
the programmer. 

The partition into test and non-test predicates exposes the implicit failure mechanism
present in logic programs. Our translation will transform non-test predicates into ordinary
functions, but transform test predicates into functions returning something of the type Result a, allowing
us to indicate failure. Essentially, if a function fails in the logic programming sense, then the
value Fail will be returned. Each "value" returned from a test predicate is only ever used
in a function if it is successfully matched against Suc a, indicating that no failure occurred (it
was not Fail). The combination of plain programs with a partition makes it easy to identify
the logic programs that can be mapped to functions, and which functions to enhance by
mimicking the implicit failure mechanism. Now, let p be a predicate symbol with mode

$$p(\underbrace{In, \ldots, In}_{i}, \underbrace{Out, \ldots, Out}_{j}).$$

Then p can naturally be translated into a function of type

$$p : T_1 \times \cdots \times T_i \to (S_1 \times \cdots \times S_j) \quad \text{if } p \text{ is a non-test predicate}$$

$$p : T_1 \times \cdots \times T_i \to Result(S_1 \times \cdots \times S_j) \quad \text{if } p \text{ is a test predicate}$$

Where the $T_k$ and $S_k$ are appropriate Haskell types. Here we will not bother further with the type
that the translated predicate has: the Haskell compiler will be able to infer it autonomously.
What is important to see is that the Haskell counterpart of p is a function which maps a
tuple with i elements into a tuple containing j elements, possibly embedded in the Result
datatype depending on whether p is a non-test predicate or not. We shall employ the value
Fail to denote the functional counterpart of failure².

3.1 The Translation 

Our translation method requires the program to be translated to be consistent and plain. 
Of these conditions, consistency is the only crucial one, in fact as stated in Remark 2.4 it 
is (virtually) always possible to transform a consistent program into an equivalent program 
which is plain; moreover, most programs are plain already. 

Now, we can transform the logic program into a Haskell one via a simple syntactic 
transformation. First, we have to translate variables, terms and predicate symbols; this 
is done in a straightforward way: one just has to respect the syntactic conventions of the 
two languages (uppercase and lowercases, and built-in predicates). Of course, predicate 
symbols are transformed into non-constructor function symbols. In the sequel we use sans- 
serif characters for Haskell constructs and typewriter font for logic programming ones, for 
instance, t, s denote the Haskell counterpart of the LP terms t, s. 

Definition 3.2 (Translation) Let P be a logic program, and 
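$$p(\bar{t}_1, \bar{s}_1) \leftarrow p_{1,1}(\bar{i}_{1,1}, \bar{o}_{1,1}), \ldots, p_{1,k_1}(\bar{i}_{1,k_1}, \bar{o}_{1,k_1}), q_{1,1}(\bar{u}_{1,1}, \bar{v}_{1,1}), \ldots, q_{1,l_1}(\bar{u}_{1,l_1}, \bar{v}_{1,l_1}).$$
$$\vdots$$
$$p(\bar{t}_n, \bar{s}_n) \leftarrow p_{n,1}(\bar{i}_{n,1}, \bar{o}_{n,1}), \ldots, p_{n,k_n}(\bar{i}_{n,k_n}, \bar{o}_{n,k_n}), q_{n,1}(\bar{u}_{n,1}, \bar{v}_{n,1}), \ldots, q_{n,l_n}(\bar{u}_{n,l_n}, \bar{v}_{n,l_n}).$$

be the set of clauses of P defining predicate p, where the predicates $p_{i,j}$ are test predicates
and the predicates $q_{i,j}$ are the non-test ones. Here we assume that the clauses have been
renamed apart, i.e., that they share no variables.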

• If p is a test predicate, then the translation of the above section into Haskell is the 
following script (for the moment the underlined parts have to be treated as if the 
underline wasn't there): 

    p(x) | (t_1) <- (x),
           Suc (o_{1,1}) <- p_{1,1}(i_{1,1}),
           ...
           Suc (o_{1,k_1}) <- p_{1,k_1}(i_{1,k_1}),
           let (v_{1,1}) = q_{1,1}(u_{1,1}),
           ...
           let (v_{1,l_1}) = q_{1,l_1}(u_{1,l_1})
               = Suc (s_1)
         | (t_2) <- (x),
           ...
               = Suc (s_2)
         ...
         | otherwise = Fail

2 Confusingly, if a test predicate returns no values then we will return Suc (), where () looks like an empty
tuple but is in actual fact the only element of the unit type.



If one of the clauses of the above section is a unit clause (i.e. $p(\bar{t}_j, \bar{s}_j)$.) or if its body
contains no test predicates then the corresponding line has the trivial guard True.

Note that sequences may be empty, so if the predicate had no output positions, then 
s would be the term ( ) . 

• If p is a non-test predicate, then the translation of the above section corresponds to the
above script after removal of the underlined parts; namely we have to eliminate from
it the otherwise statement and the Suc's from the return values. □

Clearly, list constructions and built-in predicates need to be handled separately. In
particular, a test predicate of the form t == s will be transformed into a term which returns
Suc () on success and Fail on failure. We will abuse the notation and also call this
function ==.
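To illustrate the scheme on a test predicate, here is a sketch of what the translation might produce for member in its plain form (first clause rewritten as member(El, [Head|Rest]) ← El == Head). The name eqTest is an assumption introduced here because redefining == would clash with the Haskell Prelude; the type signatures and derived instances are likewise additions of this sketch.

```haskell
data Result a = Suc a | Fail
  deriving (Show, Eq)

-- Functional counterpart of the test predicate ==
-- (renamed eqTest to avoid clashing with the Prelude operator).
eqTest :: Eq a => (a, a) -> Result ()
eqTest (t, s)
  | t == s    = Suc ()
  | otherwise = Fail

-- member with mode member(In, In), declared as a test predicate:
--   member(El, [Head|Rest]) <- El == Head.
--   member(El, [_|Rest])    <- member(El, Rest).
member :: Eq a => (a, [a]) -> Result ()
member (x1, x2)
  | (el, hd : _) <- (x1, x2)        -- head of the first clause
  , Suc () <- eqTest (el, hd)       -- test atom in its body
  = Suc ()
  | (el, _ : rest) <- (x1, x2)      -- head of the second clause
  , Suc () <- member (el, rest)     -- recursive test atom
  = Suc ()
  | otherwise = Fail                -- no clause applies: failure
```

Since member has no output positions, success is reported as Suc (). For instance, member (3, [1, 2, 3]) evaluates to Suc () and member (4, [1, 2, 3]) to Fail.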

Example 3.3 Let us now consider the program append. It is already plain, so, assuming
append to be a non-test predicate, its translation is:

    append (x1, x2)
      | ([], list) <- (x1, x2)
          = list
      | ((x:xs), list) <- (x1, x2),
        let tail' = append (xs, list)
          = x:tail'

Notice that nothing prohibits us from declaring append as a test predicate; if we do so,
the result of the translation is:

    append (x1, x2)
      | ([], list) <- (x1, x2)
          = Suc list
      | ((x:xs), list) <- (x1, x2),
        Suc tail' <- append (xs, list)
          = Suc (x:tail')
      | otherwise = Fail

In practice, the first program is more efficient than the second one (though this can differ
per compiler), and it is lazier than the second one. This is further explained in the
following aside.
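The difference in laziness between the two translations can be observed directly. In the sketch below the two versions are renamed appendN (non-test) and appendT (test) so that they can coexist in one file; these names, the type signatures, and the derived instances are assumptions of this sketch.

```haskell
data Result a = Suc a | Fail
  deriving (Show, Eq)

-- append declared as non-test: the recursive call sits in a let qualifier,
-- so it is evaluated only when tail' is demanded.
appendN :: ([a], [a]) -> [a]
appendN (x1, x2)
  | ([], list) <- (x1, x2)
  = list
  | (x : xs, list) <- (x1, x2)
  , let tail' = appendN (xs, list)
  = x : tail'

-- append declared as test: the recursive call sits in a pattern guard,
-- so it must be matched against Suc before anything is returned.
appendT :: ([a], [a]) -> Result [a]
appendT (x1, x2)
  | ([], list) <- (x1, x2)
  = Suc list
  | (x : xs, list) <- (x1, x2)
  , Suc tail' <- appendT (xs, list)
  = Suc (x : tail')
  | otherwise = Fail
```

With these definitions, take 3 (appendN ([1 ..], [0])) returns [1, 2, 3] even though the first list is infinite, while appendT ([1 ..], [0]) never returns, because every recursive call has to be matched against Suc first. On finite input both agree: appendT ([1, 2], [3]) is Suc [1, 2, 3].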



Remark 3.4 The adopted partitioning has a natural influence on the strictness of the
resulting Haskell code. Consider the differences in the above translations of append: if
append is declared as non-test then its translation will contain a let-expression in the guard

    let tail' = append (xs, list)          (1)

while if it is declared as test then, in its place, we will find the guard

    Suc tail' <- append (xs, list)         (2)

Now, while (1) is a let-expression whose bound expression will only be evaluated if (and to the
extent that) the value of tail' is demanded, (2) is a guard, which has to be satisfied
in order for the function it appears in to return a value. Indeed, in (2) the term append
(xs, list) will always be reduced until it is completely computed, i.e. until it reaches
either "success" or "failure". In this sense (2) is strict, while (1) is lazy.

This behaviour is quite natural if one considers the following: since test atoms might
fail, we cannot trust their partial answers until we have computed whether they will succeed
or not. This implies that they always have to be fully "computed", therefore forcing a strict
computation. On the other hand, non-test predicates are guaranteed to eventually succeed,
so their computation might be stopped at the moment that we have reached a partial result
which is "sufficient for our purposes". Therefore the non-test predicates naturally fit the
lazy model of computation. □

4 Logic Programs in a Lazy Functional Language

We are at last in a position to demonstrate our thesis that the set of logic programs
considered as functional needs to be expanded. As issues, we consider logic variables, dynamic
scheduling and backtracking in turn.

The dynamics of some of the programs we are going to present in this section are
unavoidably rather complex; we apologize for the inconvenience and ask the reader for
patience and understanding.

4.1 Logical Variables vs. Lazy Evaluation

Logical variables are one of the peculiarities of logic programming. Most of the time, they
are used in a standard way, that is just as variables in an imperative language - this is the
case for instance when the program is well-moded. Nevertheless there are many important
situations in which logical variables are exploited in all their power. A typical such case is
in the presence of difference structures such as difference lists.

Here we show that even when used in a truly "logical" way, logical variables are in many
cases not an exclusive feature of logic programs.

The following Polish Flag Problem example (incidentally, a simplified version of Dijkstra's
Dutch Flag Problem) reads as follows: given a list of objects which are either red or
white, rearrange it in such a way that the red elements appear first and the white ones
appear after them. The following program is inspired by [O'K90, page 117]; we have replaced
"\" by ",", thus splitting a position filled in by a difference-list into two positions. Because
of this change, additional arguments are introduced in some relations.

    polish(InList, RedWhites) ←
        distribute(InList, RedWhites, Whites, Whites, []).

    distribute([], Reds, Reds, Whites, Whites).
    distribute([X|Xs], [X|Reds0], Reds, Whites0, Whites) ←
        red(X),
        distribute(Xs, Reds0, Reds, Whites0, Whites).
    distribute([X|Xs], Reds0, Reds, [X|Whites0], Whites) ←
        white(X),
        distribute(Xs, Reds0, Reds, Whites0, Whites).

    mode polish(In, Out): non-test.
    mode distribute(In, Out, In, Out, In): non-test.

Where we assume that predicates red and white are appropriately defined elsewhere in the
program and have mode red(In): test (and likewise for white). This program is plain and consistent, and by
translating it we obtain

    polish inlist
      | t <- inlist,
        let (redwhites, whites) = distribute (t, whites, [])
          = redwhites

    distribute (is, rstail, wstail)
      | ([], rs, ws) <- (is, rstail, wstail)
          = (rs, ws)
      | (x:xs, rs, ws) <- (is, rstail, wstail),
        let (rshead, wshead) = distribute (xs, rs, ws),
        Suc () <- red x
          = (x:rshead, wshead)
      | (x:xs, rs, ws) <- (is, rstail, wstail),
        let (rshead, wshead) = distribute (xs, rs, ws),
        Suc () <- white x
          = (rshead, x:wshead)

Where red and white are again defined elsewhere in the program. This program runs perfectly
well. Notice that the definition of polish employs a circular data structure: in fact the
variable whites appears both on the left hand side and on the right hand side of the expression
let (redwhites, whites) = distribute (t, whites, []) in the guard.

Circular data structures were first advocated by Bird [Bir84] in order to avoid multiple
traversals of data structures, and since then have become a standard tool of lazy functional
languages. It is worth remarking that the above program (and most other programs
employing circular structures) would not function properly if we had used a strict functional
language.
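To try the translated program, red and white must be supplied. The sketch below does so under assumptions that are not in the paper: a two-constructor Colour type, test predicates defined by pattern matching, and explicit type signatures.

```haskell
data Result a = Suc a | Fail

-- Assumed colour type; the paper leaves the objects unspecified.
data Colour = Red | White
  deriving (Show, Eq)

-- Assumed definitions of the test predicates red and white.
red :: Colour -> Result ()
red Red = Suc ()
red _   = Fail

white :: Colour -> Result ()
white White = Suc ()
white _     = Fail

polish :: [Colour] -> [Colour]
polish inlist
  | t <- inlist
    -- circular binding: whites is both an output and an input of distribute
  , let (redwhites, whites) = distribute (t, whites, [])
  = redwhites

distribute :: ([Colour], [Colour], [Colour]) -> ([Colour], [Colour])
distribute (is, rstail, wstail)
  | ([], rs, ws) <- (is, rstail, wstail)
  = (rs, ws)
  | (x : xs, rs, ws) <- (is, rstail, wstail)
  , let (rshead, wshead) = distribute (xs, rs, ws)
  , Suc () <- red x
  = (x : rshead, wshead)
  | (x : xs, rs, ws) <- (is, rstail, wstail)
  , let (rshead, wshead) = distribute (xs, rs, ws)
  , Suc () <- white x
  = (rshead, x : wshead)
```

Under these assumptions polish [White, Red, White, Red] evaluates to [Red, Red, White, White]. The circular let works because distribute never inspects its second argument: it only passes it along until the input list is exhausted, at which point it becomes the tail of the list of reds.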

It is important to notice that the original logic program employs logical variables in a 
highly non-trivial way. This is confirmed by the fact that the program is not well-moded. 

The fact that a program using difference-lists actually presents a functional behaviour
is not incidental. Consider an atom containing a difference-list, ... p(t\s) (for the sake of
simplicity, we assume that it does not have any other argument); as above, we split this
position in two, and obtain ... p(t, s). Now, the whole idea of having difference lists is
that when a computation starting in (an instance of) ... p(t, s) succeeds, it will report
a computed answer substitution (c.a.s.) $\theta$ such that $s\theta$ is ("points to") the tail of $t\theta$. This
implies that for all $\sigma$, if $s\theta\sigma$ is ground, then $t\theta\sigma$ is ground as well. Typically, after ... p(t, s)
has succeeded with c.a.s. $\theta$, $s\theta$ will eventually be unified with a ground (classical) list (or
with the head of another difference-list, in which case the reasoning continues by considering
the tail of this second difference-structure). After this unification has taken place, $t\theta$ is going
to be a classical list, which can be employed as normal. In this sense we have that t depends
on s, therefore t has to be considered output and s input, and the above atom should be
translated into let t = p(s); the only problem is that s is an input in disguise, in the sense
that when p(t, s) is called, s is typically not (yet) ground (in other words, it is not yet
known). However, this is hardly a problem when we consider lazy functional languages (at
the same time, it is the reason why programs with difference lists cannot be easily translated
into a strict functional language).

One could argue that the above reasoning could be completely reversed, starting by
saying that "... for all $\sigma$, if $t\theta\sigma$ is ground, then $s\theta\sigma$ is ground as well, ..., thus if $t\theta$ is
unified to a ground list, then $s\theta$ will become a ground list, and this shows that s depends
on t, and that therefore the above atom should be translated into s = p(t)" (which would
fail to function). This is in principle true (dependencies in LP are always bidirectional);
however, this property is never used. Indeed, it cannot be used in practice for the following
simple reason: after succeeding with c.a.s. $\theta$, we typically have that $t\theta = [a_1, a_2, \ldots, a_k | X]$
and that $s\theta = X$. Now, while it is always possible to unify $s\theta$ with any ground list l, trying
to do this with $t\theta$ will almost certainly lead to failure (unless $[a_1, a_2, \ldots, a_k]$ is a prefix of
l). Therefore difference-lists are virtually always employed in a directional fashion.

Another example of a program using logical variables which can be safely translated
into Haskell is given in the following section. Of course, one can find an example of a
program using difference-lists which would not work in Haskell; actually, counterexamples
are extremely easy to contrive: variables in LP are always adirectional, and anyone who fully
exploits this will obtain a program which has no functioning functional counterpart.
We do not want to deny this. On the contrary: here we are interested in how programs are
usually used, and in pointing out that some standard methodologies which are normally
considered as applicable only to LP are actually not so.

Of course not all programs using logical variables are translated correctly: typical such
examples are the programs which incrementally fill in a data structure, such as in the eight
queens example and in the SEQUENCE example in Appendix C (these programs use unification
in a crucial way, and this is confirmed by the fact that they are not consistent). Other
examples are circular programs such as the following one

    p(X) ← eq(X, X).
    eq(X, X).

moded as follows: p(In:Ground): non-test and eq(Out:Ground, Out:Ground): non-test.

This program is circular in a non-well-founded way, and this, when translated, yields a
program which is not productive.

We can safely conclude that difference lists have a natural counterpart in the circular 
structures of lazy functional programming. 

4.2 Dynamic scheduling vs. Lazy Evaluation 

Another prominent property of logic programming is the possibility of having a dynamic
selection rule, possibly guided by appropriate delay declarations. Let us consider the following
example: given a list Xs of integer values, del_max(Xs, Zs) produces the list Zs by
deleting from Xs all the occurrences of its maximum element.

    del_max(Xs, Zs) ← find_max_and_del(Xs, Max, Zs, Max).

    % find_max_and_del(InList, El, OutList, Max)
    % Max is the maximum element of the list InList, and
    % OutList is obtained from InList by deleting all the occurrences of El from it
    find_max_and_del([], _, [], 0).
    find_max_and_del([X|Xs], El, Ys, Max) ←
        find_max_and_del(Xs, El, Zs, Max'),
        sup(X, Max', Max),
        del_if_first([X|Zs], El, Ys).

    del_if_first([El|Zs], El, Zs).
    del_if_first([X|Zs], El, [X|Zs]) ← X ≠ El.

    mode del_max(In:List[Int], Out:List[Int]): non-test.
    mode find_max_and_del(In:List[Int], In:Int, Out:List[Int], Out:Int): non-test.
    mode sup(In:Int, In:Int, Out:Int): non-test.    % defined in the obvious way
    mode del_if_first(In:List[Int], In:Int, Out:List[Int]): non-test.

It is worth noticing that the program uses logical variables in a nontrivial way. This is
confirmed by the fact that it is not well-moded. Specifically, the variable Max in the first
clause is used as an asynchronous communication channel between processes, as the atom
find_max_and_del(Xs, Max, Zs, Max) uses Max as an input value that it has to produce itself.

Furthermore, the program requires an appropriate dynamic scheduling. In fact, when
run with a standard left-to-right selection rule, the query del_max(ts, Zs) (ts being a list
of natural numbers) leads to a run-time error (or to an incorrect answer), and, provided
that we fix this problem, to a very inefficient computation.

The first problem (concerning the run-time error) is due to the fact that the computation
will soon reach a goal of the form del_if_first(ns, El, Zs), where ns (= [n|ns']) is a
non-empty list of integers, and El and Zs are distinct variables. At that point the interpreter
will proceed and might reach the call n ≠ El, which - El being a variable - will flounder³.
In other words, this program cannot be run with the normal leftmost selection rule.

The second problem (concerning the program's inefficiency) is due to the fact that the query
del_max(Xs, Zs) could return the list Zs in linear time (scanning Xs only once); however,
it is easy to see that if we employ any fixed selection rule, the program has to go through a
remarkable amount of backtracking, which makes it run in time quadratic in the length of
the input list⁴.

Both problems can be solved by employing a dynamic selection rule and by prohibiting
the selection of certain atoms until their arguments are sufficiently instantiated, using for
instance the following delay declarations [Nai82]:

delay sup(X, Y, _) until ground(X) ∧ ground(Y).
delay ≠(X, Y) until ground(X) ∧ ground(Y).
delay del_if_first([X | Xs], El, _) until ground(X) ∧ ground(El).

For instance, the first declaration will suspend any call to sup(t, s, v) until t and s
are ground terms. Delay declarations have become an important standard control tool and
are implemented in various versions of Prolog (for instance in SICStus Prolog and in ECLiPSe
[WV93]) and in the language Gödel [HL94].

Now, let us for a moment not bother about the delay declarations and translate this 
program into Haskell. We obtain the following script. 

del_max x1
    | xs <- x1
    , let (zs, maxel) = find_max_and_del (xs, maxel)
    = zs

find_max_and_del (x1, x2)
    | ([ ], el) <- (x1, x2)
    = ([ ], 0)
    | (x:xs, el) <- (x1, x2)
    , let (zs, maxel') = find_max_and_del (xs, el)
    , let maxel = max x maxel'
    , let ys = del_if_first (x:zs, el)
    = (ys, maxel)

del_if_first (x1, x2)
    | ([ ], el) <- (x1, x2)
    = [ ]
    | (x:zs, el) <- (x1, x2)
    , x == el
    = zs
    | (x:zs, el) <- (x1, x2)
    , x /= el
    = x:zs

³ In practice, the behaviour of interpreters in these situations differs: it depends on the non-
logical behaviour of the built-ins of PROLOG, and on whether we employ == in order to make the
program plain and \== in order to implement ≠. For instance, in ECLiPSe and SWI-Prolog the call
n == El fails, and a subsequent call n \== El succeeds (!), in ECLiPSe by returning the empty
c.a.s. and in SWI-Prolog by instantiating El to an apparently random numeric value. Of course,
these behaviours lead to solutions which are almost always incorrect.

⁴ Informally, the reason is that the interpreter will run into some calls of the form
del_if_first(..., El, Zs), where El is still a variable. For this reason del_if_first(..., El, Zs)
will take a guess and delete an element which will usually turn out not to be the correct one,
and this will eventually produce backtracking. Since the number of needed backtracking steps is
linear in the size of the input list, this degrades the performance of the program from linear
to quadratic time. In order to avoid this problem, we have to make sure that an atom of the form
del_if_first(_, El, _) is selected in the derivation only when El is instantiated to a ground
term. We can do this by employing the above delay declarations.




This program works fine, and its run-time complexity is linear in the size of the input.
We can therefore state that the lazy computational mechanism compensates for the lack of
control over dynamic scheduling, without which the above logic program could not be run
or would have quadratic complexity.
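
As a minimal sketch, the script above can be restated in ordinary curried Haskell with
explicit types. The names delMax, findMaxAndDel and delIfFirst are ours (they are not part of
the translation scheme), and the base maximum 0 assumes lists of natural numbers, as in the
query discussed above. The point of interest is the circular binding of maxel, which is passed
to findMaxAndDel before it has been computed.

```haskell
-- Circular, single-pass restatement of del_max (illustrative names, not the paper's).
delMax :: [Int] -> [Int]
delMax xs = zs
  where
    -- maxel is an argument of the very call that produces it; the lazy
    -- pattern binding delays every comparison until the maximum is known.
    (zs, maxel) = findMaxAndDel xs maxel

findMaxAndDel :: [Int] -> Int -> ([Int], Int)
findMaxAndDel []       _  = ([], 0)   -- assumption: elements are natural numbers
findMaxAndDel (x : xs) el = (ys, max x maxel')
  where
    (zs, maxel') = findMaxAndDel xs el
    ys           = delIfFirst (x : zs) el

-- Drops the head of the list when it equals el; only here is el inspected.
delIfFirst :: [Int] -> Int -> [Int]
delIfFirst []       _  = []
delIfFirst (x : zs) el
  | x == el   = zs
  | otherwise = x : zs
```

For example, delMax [3,1,4,1,5] evaluates to [3,1,4,1]. The maximum is computed by the chain
of max applications, which never inspects el, so the circular definition is well founded.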

Thus, although the mechanisms of lazy evaluation and of delay declarations are quite different
(actually, they are opposite: the call-by-need mechanism determines which term has to
be reduced, while delay declarations determine which atoms should not be resolved), they
often accomplish the same thing.

That lazy evaluation plays a crucial role here is confirmed by the fact that, if
we had declared all predicates to be test predicates (thus forcing strictness, as explained in
Remark 3.4), the translated program would not function properly.

Thus again we are in the presence of a program exploiting logical variables in a complex way
which nevertheless has a natural translation into Haskell.

4.3 Backtracking and Nondeterminism 

Another outstanding feature of logic programs is their backtracking mechanism, which
virtually implements a don't know nondeterministic system.

In the light of the above examples, we believe that nondeterminism is by far the most
important and the most widely used distinctive feature of the logic programming paradigm.
We do not want to challenge this; on the contrary. At the same time, it is important for us to
show to which extent a (lazy) functional program can mimic a logic program which uses
backtracking.

Consider the following program. 

backtracker(X) :- producer_a(Y), picky_modifier(Y, X).
backtracker(X) :- producer_b(Y), picky_modifier(Y, X).

producer_a("a").
producer_b("b").

picky_modifier("b", "c").

The adopted mode and partitioning is

backtracker(Out): non-test
producer_a(Out): non-test (and the same for producer_b)
picky_modifier(In, Out): test

Its translation is the following:

backtracker
    | let y = producer_a
    , Suc x <- picky_modifier y
    = x
    | let y = producer_b
    , Suc x <- picky_modifier y
    = x

producer_a = "a"
producer_b = "b"

picky_modifier x
    | x == "b"
    = Suc "c"
    | otherwise = Fail

The Haskell translation is able to report all the correct answers, even though in LP,
for the query <- backtracker(X), in order to return the answer X = "c" the interpreter
has to go through some backtracking: the first clause fails on picky_modifier("a", X), and
only the second clause succeeds. Notice in fact that the above logic program is not
deterministic.

Consider now the following program scheme: 

p(X) <- generate(X), test(X).

It is immediate to translate it and to check that if generate has more than one solution
then the translation does not behave as the logic program does: while the query :- p(X)
succeeds provided that one of the solutions of generate(X) satisfies test(X), the Haskell
translation manages to report an answer only in the unlikely case that the first solution
found by generate(X) satisfies test(X); in all other cases p reduces to Fail.
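
The following sketch illustrates this behaviour on a hypothetical instance of the scheme,
with facts generate(1), generate(2) and test(2). The facts, the type name Result and the
unit result of test are our assumptions; only the Suc/Fail convention and the guard-based
shape follow the translation used above.

```haskell
-- Result of a test predicate (type name assumed; constructors as in the translation).
data Result a = Suc a | Fail
  deriving (Eq, Show)

-- Hypothetical facts: generate(1). generate(2).
-- A non-test predicate yields a single value, so the first applicable clause wins.
generate :: Int
generate
  | True = 1   -- clause generate(1)
  | True = 2   -- clause generate(2): never reached

-- Hypothetical fact: test(2).
test :: Int -> Result ()
test x
  | x == 2    = Suc ()
  | otherwise = Fail

-- p(X) <- generate(X), test(X).
p :: Result Int
p
  | let x = generate
  , Suc () <- test x
  = Suc x
  | otherwise = Fail
```

Here p reduces to Fail, whereas the logic program answers X = 2 after backtracking into
generate: the second solution of generate is simply never produced.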

For the translation to work correctly, we need to avoid logic programs
in which consistent queries might originate SLD trees with more than one successful
(sub-)branch. There exist techniques based on list comprehension which translate logic
programs into functional programs in such a way that the resulting program will (eventually,
lazily) report the list of all the answers that the initial logic program would. In those cases,
however, one can clearly not talk of a literal translation, which is the starting point of our
research (programs able to return more than one answer are in our opinion intrinsically logic
programs, and therefore do not belong to our target).

To be precise, a non-deterministic logic program can be safely translated into Haskell
provided it is input discriminative, as defined below:

Definition 4.1 (Input Discriminative) Let $P$ be a program, $M_P$ be its least Herbrand
model, and

    $p_1(i_1, o_1) \leftarrow test_1, rest_1.$
    $\vdots$
    $p_n(i_n, o_n) \leftarrow test_n, rest_n.$

be the complete set of the rules of $P$ (where the conjunction $test_i$ contains only test
predicates and $rest_i$ contains only non-test predicates). We say that $P$ is input
discriminative if for each $j \neq k \in [1,n]$ such that $p_j = p_k$ we have that

• for each ground $\theta$ such that $i_j\theta = i_k\theta$ we have that
  $M_P \models \neg(test_j\theta \wedge test_k\theta)$. □

It is worth noticing that this concept of input discriminative program is rather less
restrictive than the concept of deterministic program, and that input-discriminative programs
might still require non-trivial (non-shallow, see [SS86, Ch. 6]) backtracking. We could say
that these programs admit some shallow nondeterminism.

Summarizing, one point deserves to be stressed: strictly speaking, the Haskell
translation of a program can always mimic the backtracking taking place in the original
logic program. What the Haskell translation cannot do is report multiple answers.

Failure, Nondeterminism and Related Work 

The ability of logic programs to report more than one answer, and how this is
handled in the different translation systems, is a topic which deserves a separate discussion.

Regarding this issue, the papers presenting a translation from logic to
functional programs can be divided into two main groups.

On one side we find papers which are not concerned with the nondeterminism (or the
backtracking) mechanism of logic languages [Mar94, GW92, RKS98, vR97]. These papers are
usually mainly concerned with providing a transformation system which allows one to prove
program properties such as termination of the original logic program. For this reason they
focus on obtaining a translation which maps only the non-failing computations correctly. In
these papers the failure and backtracking mechanisms are disregarded during the translation.

On the other side, we find [Mar95, Red84], in which the authors propose a translation
in which the full (PROLOG-like) computational mechanism is preserved, including the
possibility of having multiple answers for the same query and the possibility of failure. This is
achieved by letting a query return the list of computed answer substitutions, where the empty
list corresponds to the failing case, in the way advocated by Wadler [Wad85]. The
lazy computational mechanism then takes care of computing only those answers which are
necessary, and backtracking is faithfully rendered by a standard list-comprehension schema.
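
A minimal sketch of this list-of-successes schema, for a clause of the form
p(X) <- generate(X), test(X), could look as follows. The facts generate(1), generate(2),
generate(3) and the predicate "X is even" for test are hypothetical, and the code is our own
illustration of the general idea rather than the exact translation of [Mar95, Red84].

```haskell
-- Each predicate returns the lazy list of its answers; [] encodes failure.

-- Hypothetical facts: generate(1). generate(2). generate(3).
generate :: [Int]
generate = [1, 2, 3]

-- Hypothetical test predicate: succeeds (once) exactly on even numbers.
test :: Int -> [()]
test x = [() | even x]

-- p(X) <- generate(X), test(X).
-- The comprehension enumerates the solutions of generate in order and keeps
-- those for which test succeeds, which renders backtracking.
p :: [Int]
p = [x | x <- generate, () <- test x]

-- Laziness computes only the answers that are demanded: asking for one
-- answer terminates even when the generator has infinitely many solutions.
firstEven :: [Int]
firstEven = take 1 [x | x <- [1 ..], () <- test x]
```

With these definitions p evaluates to [2] and firstEven to [2]; a query with no solution
would yield the empty list.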

The translation system we have employed lies somewhere in the middle between those two 
methods. Our goal was to take also failure into account, yet retaining a literal translation 
system, in which the computational mechanism of the resulting functional program is as 
similar as possible to the one of the original logic program. 

Of course we can only correctly translate programs which do not return more than one
answer for the same query (at the same time, it is important to notice that these programs
do not have to be deterministic; for instance member is nondeterministic).

In our opinion, the possibility of returning more than one answer is to be considered
a feature peculiar to the LP paradigm, and the fact that it can be emulated by functional
programs does not undermine our position.

5 Conclusions



The goal of our research was to investigate to which extent some features considered peculiar
to the logic programming paradigm are really so. For this purpose we have devised a simple
- literal - system which enabled us to translate logic programs into the lazy functional
language Haskell.

It is known (see also [Mar94, GW92, RKS98, vR97]) that if we restrict our attention to
non-failing, non-backtracking computations then well-moded and simply moded programs have a
natural counterpart in a functional language. The properties of being well-moded and simply
moded indicate a manner of use of variables in logic programming which is undoubtedly
"functional". To this statement we want to add that well- and simply moded programs can
be considered as strictly functional, as they can be safely translated into a strict functional
language.

In this paper we have shown that in a lazy functional language, this picture broadens 
significantly, and some of the features that were - in the light of the results above - commonly 
considered as exclusive of the logic programming paradigm, can be naturally found in a lazy 
functional language such as Haskell. 

In particular, we have shown that the use of complex logical variables in data structures
such as difference lists (or as in the program del_max) finds a natural counterpart in
the circular structures [Bir84] of lazy functional programs. These structures, which were
commonly considered as "structurally logical", are thus not so. We can then attempt a
rough classification of logic programs according to the level of complexity at which they
employ their variables (backtracking and nondeterminism are not considered here). We then
have the following division.

(i) Strictly Functional programs which use variables in a standard (imperative-like) way. 
These are characterized by being well-moded (or by being so after permutation of the 
clause's body atoms). 

(ii) Lazy Functional programs which admit a safe translation into Haskell: i.e. programs 
which can be translated into Haskell (via the syntactic translation) and whose opera- 
tional behaviour is isomorphic to the one of their functional counterpart. 

(iii) Intrinsically Logical programs which do not admit a safe translation into Haskell with 
our translation scheme. 

This raises the interesting question of how large the class of intrinsically logical programs is.
Without pretending to be able to characterize this limit extensively, it is interesting to
notice that programs which are plain and consistent and which either admit a Layered Mode
[EG96], or are S-well-typed programs [BM97], are safely translatable into Haskell (modulo
the possibility of backtracking, which is discussed in the sequel). As argued in [EG96], we
believe that these programs actually encompass the majority of actual programs which use
logical variables in a non-elementary way. We think that a classification and understanding
of these levels might be useful both to enhance the performance of logic languages (as already
done to some extent in the language Mercury) and to prove more precise program properties.

Furthermore, we have also addressed another logical feature: the possibility of dynamic
scheduling. In theory, in LP any atom is selectable, as all selection rules yield the same
successful derivations. In practice this does not work, and adopting a random selection rule
would in the best case lead to an explosion of the search space; for this reason PROLOG
uses a fixed left-to-right selection rule, a feature which is either explicitly or implicitly
always exploited by programmers. However, some programs (like del_max above, or
concurrent-like programs) are not correct under a fixed search rule. In these cases the
"right" selection strategy is enforced by the use of appropriate delay declarations (d.d.),
which serve to indicate which atoms in a query should not be resolved. The implementation
of d.d. is rather costly, as atoms are continuously being suspended and forced. Here we
have seen one example in which the lazy evaluation mechanism of Haskell achieves the
same effect as the use of d.d. As we have pointed out, call-by-need can be regarded as a
dynamic selection strategy, which is however based on a principle opposite to the one of
d.d., in the sense that call-by-need determines which term has to be reduced, while delay
declarations determine which atoms should not be resolved.

A naturally arising question
here is whether it is possible to implement in logic programming languages a selection rule
which is "driven" by a call-by-need mechanism, instead of "restricted" by the use of delay
declarations. This could possibly lead to a reduction of the suspension overhead and thus to
performance improvements. The difficulty in implementing such a search rule lies in the fact
that in LP it is not clear which output values depend on which input values (actually, it is
not clear what is input and what is output to start with), so in order to implement such an
intelligent selection strategy one would need some sophisticated analysis tools, which might
either be based on abstract interpretation (with tools similar to the ones of [CDG93]), or on
refined versions of modes such as the ones described in [BM97, EG96]. Other works related
to this subject are [LK92, EvR98].

We have also discussed the fact that logic programs allow backtracking. We have seen
that - strictly speaking - backtracking computations can be easily mimicked by the
functional language by an appropriate use of guards; what cannot be (easily) mimicked in
Haskell is the possibility of returning multiple answers, at least not unless one uses additional
constructs such as the list-of-successes method [Wad85]. An interesting research direction
might be to define appropriate monadic structures (such as those in [Wad92]) to capture
failure or success and the return of multiple arguments. This would broaden the set of
logic programs which we can capture with our simple translation scheme, without adding
significant complexity to it.

In conclusion, we have demonstrated with a simple, literal translation scheme that
several features considered as belonging specifically to logic programming are found naturally
in lazy functional programming, dismissing the folklore that the functional core of logic
programming is contained in the set of well- and simply moded programs.

A Well-Moded Programs

The following concept is essentially due to Dembinski and Maluszynski [DM85]; we use here
an elegant formulation due to Rosenblueth [Ros91].

Definition A.1 A clause $p_0(t_0, s_{n+1}) \leftarrow p_1(s_1, t_1), \ldots, p_n(s_n, t_n)$ is
called well-moded if for $i \in [1, n+1]$

    $$Var(s_i) \subseteq \bigcup_{j=0}^{i-1} Var(t_j).$$

A query A is called well-moded iff the clause q <- A is, where q is any (dummy) atom of
zero arity.

A program is called well-moded if every clause of it is. □
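
The condition of Definition A.1 can be checked mechanically. The sketch below is our own
illustration: an atom is abstracted to the lists of variables occurring in its input and
output positions, and the types Atom and Clause, as well as the two sample clauses, are
assumptions made for the example. The first sample is the recursive clause of append with mode
(In, In, Out); the second is our reading of the first clause of del_max discussed in the main
text, del_max(Xs, Zs) <- find_max_and_del(Xs, Max, Zs, Max).

```haskell
type Var = String

-- An atom, abstracted to the variables in its input and in its output positions.
data Atom = Atom [Var] [Var]

-- A clause: head atom and body atoms, in textual order.
data Clause = Clause Atom [Atom]

-- Definition A.1: the inputs of each body atom, and the outputs of the head,
-- may only use variables from the head inputs or from earlier body outputs.
wellModed :: Clause -> Bool
wellModed (Clause (Atom headIn headOut) body) = go headIn body
  where
    -- 'produced' accumulates Var(t_0), ..., Var(t_{i-1})
    go produced [] = all (`elem` produced) headOut          -- case i = n+1
    go produced (Atom ins outs : rest) =
      all (`elem` produced) ins && go (produced ++ outs) rest

-- app([X|Xs], Ys, [X|Zs]) <- app(Xs, Ys, Zs).
appClause :: Clause
appClause =
  Clause (Atom ["X", "Xs", "Ys"] ["X", "Zs"])
         [Atom ["Xs", "Ys"] ["Zs"]]

-- del_max(Xs, Zs) <- find_max_and_del(Xs, Max, Zs, Max).
delMaxClause :: Clause
delMaxClause =
  Clause (Atom ["Xs"] ["Zs"])
         [Atom ["Xs", "Max"] ["Zs", "Max"]]
```

Under these assumptions wellModed appClause holds, while wellModed delMaxClause does not,
since Max occurs in an input position of the body atom before anything has produced it.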

It is important to notice that the first atom of a well-moded goal is ground in its input
positions and a variant of a well-moded clause is well-moded. Furthermore, the notion of
well-modedness is "persistent", as shown by the following Lemma. Recall that an LD-
resolvent is a resolvent in which the leftmost atom of the query is the selected one, and
that an LD-derivation is a derivation obtained employing the leftmost selection rule.

Lemma A.2 An LD-resolvent of a well-moded goal and a well-moded clause that is variable-
disjoint with it, is well-moded. □

The next result is originally due to Dembinski and Maluszynski and follows directly from
the definition of well-moded program.

Corollary A.3 Let $P$ and $A$ be well-moded, and let $\xi$ be an LD-derivation of $A$ in $P$.
All atoms selected in $\xi$ contain ground terms in their input positions. □

That is, in the presence of well-moded programs and queries, if we use a left-to-right
computation schema we are sure that every time we select an atom, the "value" of its input
arguments has already been fully computed. This shows that well-moded programs have a
straightforward left-to-right data-flow.

Under certain conditions well-moded programs are also unification-free. To show this, 
we need a definition first. The following notion was first defined in [AE93]. 

Definition A.4 A clause $p_0(s_0, t_{n+1}) \leftarrow p_1(s_1, t_1), \ldots, p_n(s_n, t_n)$ is
called simply moded if $t_1, \ldots, t_n$ is a linear family of variables and for $i \in [1, n]$

    $$Var(t_i) \cap \Big(\bigcup_{j=0}^{i} Var(s_j)\Big) = \emptyset.$$

A query A is called simply moded iff the clause q <- A is, where q is any (dummy) atom of
zero arity.

A program is called simply moded if every clause of it is. □

Thus, assuming that in every atom the input positions occur first, a clause is simply 
moded if all output positions of every body atom are filled in by distinct variables, which 
do not occur earlier in the body nor in an input position of the head. 

It is worth noticing that - as shown by the little survey in [AE93] - most programs are
already simply moded and that non-simply-moded programs can often be naturally
transformed into simply moded ones. For instance, the non-simply-moded clause
last(List, El) :- reverse(List, [El | _]). can be transformed into
last(List, El) :- reverse(List, List'), [El | _] = List'.

The property of being simply moded is also "persistent" in the sense that the resolvent 
of a simply moded query with a simply moded clause is simply moded. 

In [AE93] it is proven that if the program and the query are simply moded, then they
generate an LD-derivation which is unification-free, i.e. each time an atom A is selected
and resolved in it via a clause H <- B, the unification of A and H does not really require
a full unification algorithm, but can always be reduced to a double matching: one ("from" A
"to" H) for the input positions and a second one ("from" H "to" A) for the output ones. This
result clearly shows that simply and well-moded logic programs are functional in nature
(apart from the possibility of reporting multiple answers, of course).

B More on Haskell 

The following (nonsense) program embodies most of the concepts we use: 

append ([ ], x) = Suc x
append (x1, x2)
    | (x:xs) <- x1
    , Suc tail <- append (xs, x2)
    , let newtail = (x:tail)
    = Suc newtail
    | otherwise = Fail

In a call of append (a, b), the first equation will be tried first. Here, a will be pattern
matched to the empty list. If this succeeds, then x is matched to b (since both are variables,
this will always succeed), and finally Suc x is returned. If the pattern match above failed,
then the first guard will be tried, which tries to pattern match the first element of the tuple
to a list with at least one element. Should this succeed, the result of a recursive call to
append is matched against Suc tail, and if successful (x:tail) is bound to the variable newtail,
followed by the returning of Suc newtail. If either of the pattern matches failed, then the
second guard will be tried.



C Program SEQUENCE



This example is provided by the Prolog formalization of a problem from Coelho and Cotta
[CC88, pag. 193]: arrange three 1's, three 2's, ..., three 9's in sequence so that for all
i ∈ [1,9] there are exactly i numbers between successive occurrences of i.



% sublist(Xs, Ys) <- Xs is a sublist of the list Ys.
sublist(Xs, Ys) <- app(_, Zs, Ys), app(Xs, _, Zs).

% sequence(Xs) <- Xs is a list of 27 elements.
sequence([_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_]).

% question(Ss) <- Ss is a list of 27 elements forming the desired sequence.
question(Ss) <-
    sequence(Ss),
    sublist([1,_,1,_,1], Ss),
    sublist([2,_,_,2,_,_,2], Ss),
    sublist([3,_,_,_,3,_,_,_,3], Ss),
    sublist([4,_,_,_,_,4,_,_,_,_,4], Ss),
    sublist([5,_,_,_,_,_,5,_,_,_,_,_,5], Ss),
    sublist([6,_,_,_,_,_,_,6,_,_,_,_,_,_,6], Ss),
    sublist([7,_,_,_,_,_,_,_,7,_,_,_,_,_,_,_,7], Ss),
    sublist([8,_,_,_,_,_,_,_,_,8,_,_,_,_,_,_,_,_,8], Ss),
    sublist([9,_,_,_,_,_,_,_,_,_,9,_,_,_,_,_,_,_,_,_,9], Ss).

augmented by the append program.